AITopics

2506.09069

Genre: Research Report (0.64)

Industry: Education > Educational Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Chhablani, Chirag, Sharma, Nikhita, Hosier, Jordan, Gurbani, Vijay K.

Digits micro-model for accurate and secure transactions

arXiv.org Artificial Intelligence Feb-2-2024

Automatic Speech Recognition (ASR) systems are used in the financial domain to enhance the caller experience by enabling natural language understanding and facilitating efficient and intuitive interactions. Increasing use of ASR systems requires that such systems exhibit very low error rates. The predominant ASR models used to collect numeric data are large, general-purpose commercial models -- Google Speech-to-text (STT) or Amazon Transcribe -- or open-source models (OpenAI's Whisper). Such ASR models are trained on hundreds of thousands of hours of audio data and require considerable resources to run. Despite recent progress in large speech recognition models, we highlight the potential of smaller, specialized "micro" models. Such light models can be trained to perform well on number-recognition tasks, competing with general models like Whisper or Google STT while using less than 80 minutes of training time and occupying at least an order of magnitude less memory. Also, unlike larger speech recognition models, micro-models are trained on carefully selected and curated datasets, which makes them highly accurate, agile, and easy to retrain, while using low compute resources. We present our work on creating micro-models for multi-digit number recognition that handle diverse speaking styles reflecting real-world pronunciation patterns. Our work contributes to domain-specific ASR models, improving digit recognition accuracy and data privacy. As an added advantage, their low resource consumption allows them to be hosted on-premise, keeping private data local instead of uploading it to an external cloud. Our results indicate that our micro-model makes fewer errors than the best-of-breed commercial or open-source ASRs in recognizing digits (1.8% error rate for our best micro-model versus 5.8% for Whisper), and has a low memory footprint (0.66 GB VRAM for our model versus 11 GB VRAM for Whisper).

dataset, recognition, utterance, (16 more...)

2402.01931

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.70)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Asish, Kottakota, Teja, P. Sarath, Chander, R. Kishan, Hema, Dr. D. Deva

NeuroWrite: Predictive Handwritten Digit Classification using Deep Neural Networks

arXiv.org Artificial Intelligence Nov-2-2023

The rapid evolution of deep neural networks has revolutionized the field of machine learning, enabling remarkable advancements in various domains. In this article, we introduce NeuroWrite, a unique method for predicting the categorization of handwritten digits using deep neural networks. Our model exhibits outstanding accuracy in identifying and categorising handwritten digits by utilising the strength of convolutional neural networks (CNNs) and recurrent neural networks (RNNs). In this article, we give a thorough examination of the data preparation methods, network design, and training methods used in NeuroWrite. By implementing state-of-the-art techniques, we showcase how NeuroWrite can achieve high classification accuracy and robust generalization on handwritten digit datasets, such as MNIST. Furthermore, we explore the model's potential for real-world applications, including digit recognition in digitized documents, signature verification, and automated postal code recognition. NeuroWrite is a useful tool for computer vision and pattern recognition because of its performance and adaptability. The architecture, training procedure, and evaluation metrics of NeuroWrite are covered in detail in this study, illustrating how it can improve a number of applications that call for handwritten digit classification. The outcomes show that NeuroWrite is a promising method for raising the bar for deep neural network-based handwritten digit recognition.

dataset, handwritten digit, recognition, (15 more...)

2311.01022

Country: Asia > India > Tamil Nadu > Chennai (0.05)

Genre: Research Report > Promising Solution (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing Systems Apr-6-2023, 18:47:33 GMT

Using a neural net to instantiate a deformable model

Deformable models are an attractive approach to recognizing non-rigid objects which have considerable within-class variability. However, there are severe search problems associated with fitting the models to data. We show that by using neural networks to provide better starting points, the search time can be significantly reduced. The method is demonstrated on a character recognition task. In previous work we have developed an approach to handwritten character recognition based on the use of deformable models (Hinton, Williams and Revow, 1992a; Revow, Williams and Hinton, 1993).

deformable model, instantiation parameter, (15 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.72)

Chhikara, Prateek, Kuhar, Harshul, Goyal, Anil, Sharma, Chirag

DIGITOUR: Automatic Digital Tours for Real-Estate Properties

arXiv.org Artificial Intelligence Jan-16-2023

A virtual or digital tour is a form of virtual reality technology which allows a user to experience a specific location remotely. Currently, these virtual tours are created by following a 2-step strategy. First, a photographer captures 360-degree equirectangular images; then, a team of annotators manually links these images for the "walkthrough" user experience. The major challenge in the mass adoption of virtual tours is the time and cost involved in manual annotation/linking of images. Therefore, this paper presents an end-to-end pipeline to automate the generation of 3D virtual tours using equirectangular images for real-estate properties. We propose a novel HSV-based coloring scheme for paper tags that need to be placed at different locations before capturing the equirectangular images using 360-degree cameras. These tags have two characteristics: i) they are numbered, which helps the photographer place the tags in sequence; and ii) they are bi-colored, which allows better learning of the tag detection (using the YOLOv5 architecture) and digit recognition (using a custom MobileNet architecture) tasks. Finally, we link/connect all the equirectangular images based on detected tags. We show the efficiency of the proposed pipeline on a real-world equirectangular image dataset collected from the Housing.com database.

artificial intelligence, equirectangular image, machine learning, (15 more...)

doi: 10.1145/3570991.3571060

2301.0668

Country:

Asia > India > Maharashtra > Mumbai (0.06)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.64)

Industry: Banking & Finance > Real Estate (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.68)
Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (0.68)

#artificialintelligence May-14-2021, 10:25:15 GMT

Automatic Sudoku (Number Place) Solver with Digit Recognition and Integer Linear Programming

Sudoku is a logic-based number placement puzzle that consists of 81 cells, which are divided into 9 columns, 9 rows and 9 blocks. The goal of this game is to fill each cell with a number from 1–9 so that there are no repeating numbers in any row, column or block. In this post, I aim to introduce a digit recognition and integer linear programming based automatic sudoku solver that uses the following: Keras (based on the MNIST database [1]) and OpenCV for digit recognition and PuLP for integer linear programming. The database is also widely used for training and testing in the field of machine learning. In this section, I explain the overview of image processing for digit recognition.
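
The integer linear programming step mentioned above can be written down compactly. The formulation below is the standard textbook one and is only a sketch; the post's own model is not shown in this excerpt and may differ. The symbols x, B_b (the set of cells in block b) and G (the set of digits read from the image) are notation assumed here for illustration.

```latex
% Binary decision variable: x_{r,c,v} = 1 iff cell (r, c) holds digit v.
\begin{align*}
& x_{r,c,v} \in \{0, 1\}, && r, c, v \in \{1, \dots, 9\} \\
& \sum_{v=1}^{9} x_{r,c,v} = 1 && \text{for every cell } (r, c) \\
& \sum_{c=1}^{9} x_{r,c,v} = 1 && \text{for every row } r \text{ and digit } v \\
& \sum_{r=1}^{9} x_{r,c,v} = 1 && \text{for every column } c \text{ and digit } v \\
& \sum_{(r,c) \in B_b} x_{r,c,v} = 1 && \text{for every block } b \text{ and digit } v \\
& x_{r,c,v} = 1 && \text{for every recognized given } (r, c, v) \in G
\end{align*}
```

Because any feasible assignment is a solution, the objective can be a constant (for example, minimize 0); the solver is used purely to find a feasible point.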

digit recognition, recognition and integer linear programming, sudoku puzzle, (7 more...)

Industry: Leisure & Entertainment > Games > Sudoku (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.91)

#artificialintelligence Mar-29-2021, 13:05:23 GMT

AI Sudoku Solver

Sudoku is a puzzle in which players insert the numbers one to nine into a grid consisting of nine squares subdivided into a further nine smaller squares in such a way that every number appears once in each horizontal line, vertical line, and square. Using OpenCV, deep learning, and a backtracking algorithm, we can solve the sudoku puzzle. First, build the character recognition model that can extract digits from a Sudoku grid image, and then work on a backtracking approach to solve it. The deep-learning-based AI_Sudoku_Solver architecture uses OpenCV (opencv 4.2.0) and Python (python 3.7). The Convolutional Neural Network (CNN) model for digit recognition uses Keras (keras 2.3.1) on TensorFlow.
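
The backtracking step described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the AI_Sudoku_Solver project's code: the function names and the convention that 0 marks an empty cell are assumptions made here, and the grid is assumed to have already been extracted by the digit recognition model.

```python
from typing import List, Optional, Tuple

Grid = List[List[int]]  # 9x9 grid; 0 marks an empty cell (assumed convention)


def is_valid(grid: Grid, row: int, col: int, digit: int) -> bool:
    """Return True if `digit` can be placed at (row, col) without a clash."""
    if digit in grid[row]:
        return False
    if any(grid[r][col] == digit for r in range(9)):
        return False
    top, left = 3 * (row // 3), 3 * (col // 3)  # corner of the 3x3 block
    return all(
        grid[r][c] != digit
        for r in range(top, top + 3)
        for c in range(left, left + 3)
    )


def find_empty(grid: Grid) -> Optional[Tuple[int, int]]:
    """Return the first empty cell in row-major order, or None if full."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                return r, c
    return None


def solve(grid: Grid) -> bool:
    """Fill `grid` in place; return False if no solution exists."""
    cell = find_empty(grid)
    if cell is None:
        return True  # no empty cells left: solved
    row, col = cell
    for digit in range(1, 10):
        if is_valid(grid, row, col, digit):
            grid[row][col] = digit
            if solve(grid):
                return True
            grid[row][col] = 0  # undo the guess and try the next digit
    return False
```

Calling `solve(grid)` mutates the grid in place and returns whether a solution was found. A misrecognized digit can make the puzzle unsolvable, in which case the function returns False and leaves the grid as it was passed in.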

ai sudoku solver, constraint, grid, (8 more...)

Industry: Leisure & Entertainment > Games > Sudoku (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

#artificialintelligence Jan-29-2020, 21:31:46 GMT

Digit Recognition: A Beginner's Guide to Keras

Over the last decade, the use of artificial neural networks (ANNs) has increased considerably. People have used ANNs in medical diagnoses, to predict Bitcoin prices, and to create fake Obama videos! With all the buzz about deep learning and artificial neural networks, haven't you always wanted to create one for yourself? In this tutorial, we'll create a model to recognize handwritten digits. We use the keras library for training the model in this tutorial. Keras is a high-level library in Python that is a wrapper over TensorFlow, CNTK and Theano.

dataset, digit, neural network, (16 more...)

Industry: Banking & Finance > Trading (0.55)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.55)

#artificialintelligence Apr-11-2019, 17:46:14 GMT

Kickstart your experiments from examples - Azure Machine Learning Studio

For example, to browse experiments that use a PCA-based anomaly detection algorithm: Under Categories click Experiment. Then, under Algorithms Used, click Show all and in the dialog box choose PCA-Based Anomaly Detection. You may have to scroll to see it. For example, to find experiments contributed by Microsoft related to digit recognition that use a two-class support vector machine algorithm, enter "digit recognition" in the search box.

data mining, experiment, machine learning, (10 more...)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.96)

Halkias, Xanadu, Paris, Sebastien, Glotin, Herve

Sparse Penalty in Deep Belief Networks: Using the Mixed Norm Constraint

arXiv.org Machine Learning Feb-22-2013

Deep Belief Networks (DBNs) have been successfully applied to popular machine learning tasks. Specifically, when applied to hand-written digit recognition, DBNs have achieved accuracy rates of approximately 98.8%. In an effort to optimize the data representation achieved by the DBN and maximize its descriptive power, recent advances have focused on inducing sparse constraints at each layer of the DBN. In this paper we present a theoretical approach for sparse constraints in the DBN using the mixed norm for both non-overlapping and overlapping groups. We explore how these constraints affect the classification accuracy for digit recognition in three different datasets (MNIST, USPS, RIMES) and provide initial estimations of their usefulness by altering different parameters such as the group size and overlap percentage.
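
To make the group-sparsity idea above concrete, here is a small sketch of a mixed-norm penalty over groups of hidden-unit activations. The abstract does not say which mixed norm the paper uses; the l1/l2 form (sum over groups of each group's Euclidean norm) is assumed here because it is the common choice. The helper names, and expressing overlap as a count of shared units rather than a percentage, are also assumptions for illustration.

```python
import math
from typing import List, Sequence


def make_groups(n_units: int, size: int, overlap: int = 0) -> List[List[int]]:
    """Split unit indices 0..n_units-1 into consecutive groups.

    `overlap` is the number of indices shared by neighbouring groups
    (0 gives non-overlapping groups). The last group may be shorter.
    """
    step = size - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than the group size")
    return [
        list(range(start, min(start + size, n_units)))
        for start in range(0, n_units - overlap, step)
    ]


def mixed_norm(values: Sequence[float], groups: Sequence[Sequence[int]]) -> float:
    """Assumed l1/l2 mixed norm: sum of the Euclidean norms of each group."""
    return sum(
        math.sqrt(sum(values[i] ** 2 for i in group))
        for group in groups
    )
```

With singleton groups this reduces to the ordinary l1 norm, and with one group covering all units it reduces to the l2 norm, which is why the group size and overlap act as tuning parameters for how the sparsity is structured.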

activation probability, artificial intelligence, machine learning, (15 more...)

arXiv.org Machine Learning

1301.3533

Country: North America > United States (0.37)

Genre: Research Report (0.40)

Industry:

Government > Post Office (0.51)
Government > Regional Government > North America Government > United States Government (0.37)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.84)